8a7c1bf7 |
1 | In Fall of 2006 I did a small project on Metaobject Protocols for my |
2 | CS 331 class. Here lie my notes which may perhaps be useful to |
3 | others. I hope to expand them into something more useful over time. |
4 | |
5 | * Background |
6 | |
7 | ** Object Protocols |
8 | |
9 | An object protocol is a set of methods and specification of the |
10 | interactions between the methods which provide some generic behavior |
11 | (e.g. of a sequence) that are then implemented by classes which |
12 | conform to the protocol (e.g. a vector or list). In most object |
13 | systems a class contains both the methods which implement a protocol |
14 | and the data used by the implementation. The intent is to emulate |
15 | state machines which pass messages between each other. |
16 | |
17 | ** CLOS Way of OO |
18 | |
19 | The Common Lisp Object System (CLOS) is different. It separates |
20 | the data and method concepts into classes and generics. A class |
21 | contains data fields only, and a generic has methods specialized for |
22 | certain types attached to it. This seems a bit weird at first, but is |
23 | significantly more powerful as it encourages complete encapsulation |
24 | through its use of classes primarily for method specialization rather |
25 | than for state storage. |
26 | |
bdbaa8e0 |
27 | *** Classes for Scratch Data and Types |
8a7c1bf7 |
28 | |
29 | In CLOS classes store data in slots (which are the same as data |
30 | members). Encapsulation is not provided; any bit of code can use |
31 | =slot-value= to access or set the value of a slot. This may seem odd at |
32 | first, but encapsulation is of questionable importance as the slots |
33 | are meant only to be used by the protocol defined around the class. |
34 | |
bdbaa8e0 |
35 | Classes are defined with =defclass= |
8a7c1bf7 |
36 | |
37 | <src lang="lisp"> |
38 | (defclass name (superclasses ...) |
39 | ((slot-name :accessor slot-accessor ...) |
40 | ...) |
41 | (class-options ...)) |
42 | |
43 | (defclass example () |
44 | ((foo :accessor foo-of :initform 5))) |
45 | |
46 | (defclass example-child (example) |
47 | ((bar :accessor bar-of :initform (list 1 2 3)))) |
48 | </src> |
49 | |
b796d643 |
50 | Slot definitions have several options; the above example shows only the |
8a7c1bf7 |
51 | =:accessor= and =:initform= options which are the most commonly |
52 | used. =:accessor= generates an accessor for the slot (e.g. if you have |
bdbaa8e0 |
53 | an instance of =example= you can =(setf (foo-of some-example-instance) |
54 | 'some-value)= to set and =(foo-of some-example-instance)= to access the |
8a7c1bf7 |
55 | value). =:initform= provides a default initial value for the slot as a |
bdbaa8e0 |
56 | symbolic expression to be evaluated when an instance is created in the |
57 | lexical environment of the class definition. |
8a7c1bf7 |
58 | |
bdbaa8e0 |
59 | *** Generics with Methods that Implement Protocols |
8a7c1bf7 |
60 | |
61 | Generics are like normal functions in Lisp, but they only provide a |
62 | lambda list (parameter list). Methods are added to the generic which |
bdbaa8e0 |
63 | specialize on the types of their parameters and provide an |
64 | implementation. This allows writing rich layered protocols which can |
65 | enable selective modification of individual facets with minimal code. |
8a7c1bf7 |
66 | |
67 | <src lang="lisp"> |
68 | (defgeneric generic (parameters ...) |
69 | (options) ...) |
70 | |
71 | (defmethod generic-name ((parameter type) parameter ...) |
72 | "documentation string" |
73 | body) |
74 | |
75 | (defgeneric foo (bar baz quux) |
76 | (:documentation "Process the baz with the quux capacitor to make the |
77 | foo widget fly into the sky at warp speed")) |
78 | |
79 | (defmethod foo ((bar example) baz (quux capacitor)) |
80 | (launch bar (process-with quux baz))) |
81 | </src> |
82 | |
83 | A method lambda list differs from a normal lambda list only in that it |
84 | can specify the type of the parameter using the notation =(name type)=. |
85 | Note also that methods can specialize on the types of every |
86 | argument and not just the first one. This is quite powerful for |
87 | reasons outside of the scope of this presentation. |
88 | |
89 | * Limitations of Default Language Behavior |
90 | |
91 | The behavior of a language is a compromise between many competing |
bdbaa8e0 |
92 | issues that attempts to be as generally useful as possible so that |
93 | *most* applications will have no issue with the default behavior. There |
94 | are, however, certain applications that could be cleanly written with |
95 | minor modifications to the behavior of the language, but would be |
96 | impossible or quite difficult to write otherwise. |
8a7c1bf7 |
97 | |
98 | ** Slot Storage |
99 | |
100 | Most languages choose to preallocate storage for all of the slots of |
bdbaa8e0 |
101 | an instance. Now imagine a contact database that stores information |
102 | about people in slots of a class. There may be dozens of slots, but |
103 | often many of them will be left blank. If slot storage is preallocated |
104 | much memory will be wasted and the database may not be able to fit |
105 | into the memory of the hardware it must run on (perhaps for financial |
106 | reasons, huge datasets, etc.). |
8a7c1bf7 |
107 | |
108 | To save memory the author of the contact database must implement his |
109 | own system to store properties and allocate them lazily. This |
110 | represents a fair bit of effort, and would implement a system that |
bdbaa8e0 |
111 | differed from the existing slot system of classes only regarding slot |
112 | storage. |
8a7c1bf7 |
113 | |
bdbaa8e0 |
114 | It would be useful if there were a way to customize slot allocation in |
115 | instances. The customizations would be minor and require overriding |
8a7c1bf7 |
116 | only the initial allocation behavior and the behavior of the first |
117 | assignment to the slot. It is a a trivial problem in a language that |
bdbaa8e0 |
118 | allows customization of these behaviors. |
8a7c1bf7 |
119 | |
120 | ** Design Patterns |
121 | |
122 | Design Patterns are generalized versions of common patterns found in |
123 | programs. Many of them are merely methods to get around deficiencies |
124 | in the language, and can be quite messy to implement in some |
bdbaa8e0 |
125 | languages. Ideally a pattern would be subsumed by the language, but |
b796d643 |
126 | real world constraints require language standards to remain fairly |
bdbaa8e0 |
127 | static. |
8a7c1bf7 |
128 | |
129 | * Metasoftware |
130 | |
131 | Some types of programs could be written easily if the language were |
bdbaa8e0 |
132 | customizable but are nearly impossible to write when it is not. |
8a7c1bf7 |
133 | |
134 | ** Runtime Generated Classes |
135 | |
136 | Say you wanted to write a video game where players could create their |
137 | own objects, attach behaviors to the objects, and perhaps mix |
138 | different objects together to create new ones. When you abstract the |
139 | problem this looks just like an object system! Wouldn't it be nice if |
bdbaa8e0 |
140 | your program could create new classes and methods on the fly portably? |
8a7c1bf7 |
141 | |
142 | ** Object Inspection |
143 | |
bdbaa8e0 |
144 | Imagine you were developing a complicated program with many different |
145 | objects that interacted in fairly complex ways. A tool to inspect the |
146 | structure of objects while debugging would be quite useful, but in a |
147 | traditional language would be impossible to implement portably. This |
148 | could force you to purchase a certain compiler implementation which |
149 | provided an inspector, and even then would likely not be customizable. |
8a7c1bf7 |
150 | |
151 | This problem can be generalized to apply to most debugging tools; it |
152 | would be useful to write such tools portably because users of the |
153 | *language* and not the *compiler* need to debug software. Sharing |
154 | infrastructure would result in better tools (more developers), and |
bdbaa8e0 |
155 | save the man-years of wasted effort that comes with having to rewrite |
156 | unportable tools from scratch multiple times. |
8a7c1bf7 |
157 | |
158 | * Metaobject Protocols |
159 | |
160 | ** Limited/Generalized Internals of the Implementation |
161 | |
bdbaa8e0 |
162 | A Metaobject Protocol (MOP) is a generalized and limited subset of the |
163 | underlying language implementation. It is limited to allow multiple |
164 | implementation strategies; this, along with careful design, is |
165 | essential because programming language research is ever advancing and |
166 | new techniques for creating more reliable and faster implementations |
167 | are still being discovered. |
8a7c1bf7 |
168 | |
169 | This subset of the implementation is exported as a set of methods on |
bdbaa8e0 |
170 | metaobjects. Thus the language is implemented in itself. The system |
171 | can then be customized using the extension and overriding features of |
172 | the language itself. |
8a7c1bf7 |
173 | |
174 | ** Classes of MOPs |
175 | |
176 | *** Reflective |
177 | |
bdbaa8e0 |
178 | A reflective MOP provides an interface to information *about* the |
179 | running system. It exposes class relationships, the methods attached |
180 | to a generic, etc. A reflective MOP often provides some functionality |
181 | for creating new classes at runtime. Smalltalk was one of the first |
182 | languages to expose a reflective MOP. |
8a7c1bf7 |
183 | |
184 | **** Example: Object Inspector |
185 | |
8a7c1bf7 |
186 | <src lang="lisp"> |
187 | (defgeneric example-inspect (instance) |
188 | (:documentation "Simple object inspector using CLOS MOP")) |
189 | |
190 | (defmethod example-inspect ((instance t)) |
191 | (format t "Simple Object~% Value: ~S~%" instance)) |
192 | |
193 | (defmethod example-inspect ((instance standard-object)) |
194 | (let ((class (class-of instance))) |
195 | (format t "Class: ~S, Superclasses: ~S~%" |
196 | (class-name class) |
197 | (mapcar #'class-name |
198 | (class-precedence-list class))) |
199 | (let ((slot-names (mapcar #'slot-definition-name |
200 | (class-slots class)))) |
201 | (format t "Slots: ~%~{ ~S~%~}" slot-names) |
202 | (inspect-loop slot-names instance #'example-inspect)))) |
203 | |
204 | (defun inspect-loop (slots instance inspector) |
205 | (format t "Enter slot to inspect or :pop to go up one level: ") |
206 | (finish-output) |
207 | (let* ((slot (read)) |
208 | (found-slot (member slot slots))) |
209 | (cond (found-slot |
210 | (funcall inspector (slot-value instance slot)) |
211 | (funcall inspector instance)) |
212 | ((eq slot :pop) t) |
213 | (t |
214 | (format t "~S is invalid. Valid slot names: ~S~%" |
215 | slot |
216 | slots) |
217 | (inspect-loop slots instance inspector))))) |
218 | </src> |
219 | |
bdbaa8e0 |
220 | **** Example: Runtime Generated Classes and Methods |
221 | |
222 | *** Intercessory |
223 | |
224 | Intercessory MOPs allow the user to customize language behavior by |
225 | implementing methods which override certain aspects of the language |
226 | behavior. This class of MOPs are what make MOPs especially |
227 | powerful. No longer must a problem be restructured to fit the |
b796d643 |
228 | implementation language; the underlying language can be reshaped to |
bdbaa8e0 |
229 | fit the task at hand, and obfuscation of the intended structure of the |
230 | application can be avoided. |
231 | |
232 | **** Example: Lazily Allocated Slots |
233 | |
234 | **** Example: Observer Design Pattern |
8a7c1bf7 |
235 | |
236 | A simple implementation of the observer pattern is under 100 lines, |
237 | and the user level code requires only a single line of code to make |
238 | any existing class observable. |
239 | |
240 | In a language lacking a MOP, implementing the observer pattern |
241 | requires modifying every accessor of a class to explicitly invoke any |
b796d643 |
242 | observers, and necessitates the addition of a mixin class to the class |
243 | hierarchy. The fact that an object can be observed is a meta property |
8a7c1bf7 |
244 | of the class, and forcing it to be implemented at the application |
b796d643 |
245 | level dirties the inheritance hierarchy and adds unnecessary meta |
8a7c1bf7 |
246 | details to the program. |
247 | |
248 | <src lang="lisp"> |
249 | ;;; This metaclass adds a slot to instances which use it, and so the |
250 | ;;; system is defined in its own package to avoid name conflicts |
251 | (defpackage :observer |
b796d643 |
252 | (:use :cl :c2mop) |
8a7c1bf7 |
253 | (:export observable register-observer unregister-observer)) |
254 | |
255 | (in-package :observer) |
256 | |
257 | ;;; Metaclass |
258 | (defclass observable (standard-class) |
259 | () |
260 | (:documentation "Metaclass for observable objects")) |
261 | |
262 | (defmethod compute-slots ((class observable)) |
263 | "Add a slot for storing observers to observable instances" |
264 | (cons (make-instance 'standard-effective-slot-definition |
265 | :name 'observers |
266 | :initform '(make-hash-table) |
267 | :initfunction #'(lambda () (make-hash-table))) |
268 | (call-next-method))) |
269 | |
270 | (defmethod validate-superclass ((class observable) |
271 | (super standard-class)) |
272 | t) |
273 | |
274 | (defun register-observer (instance slot-name key closure) |
275 | (register-observer-with-class (class-of instance) |
276 | instance |
277 | slot-name |
278 | key |
279 | closure)) |
280 | |
281 | (defun unregister-observer (instance slot-name key) |
282 | (unregister-observer-with-class (class-of instance) |
283 | instance |
284 | slot-name |
285 | key)) |
286 | |
287 | (defun get-observers (instance slot-name) |
288 | (get-observers-with-class (class-of instance) |
289 | instance |
290 | slot-name)) |
291 | |
292 | (defun add-observer-table (instance slot-name) |
293 | (setf (gethash slot-name (slot-value instance |
294 | 'observers)) |
295 | (make-hash-table))) |
296 | |
297 | (defgeneric register-observer-with-class (class instance slot-name key closure)) |
298 | (defgeneric unregister-observer-with-class (class |
299 | instance |
300 | slot-name |
301 | key)) |
302 | |
303 | (defmethod register-observer-with-class ((class observable) |
304 | instance |
305 | slot-name |
306 | key |
307 | closure) |
308 | (setf (gethash key |
309 | (or (gethash slot-name |
310 | (slot-value instance 'observers)) |
311 | ;; Lazily add observer hash tables |
312 | (add-observer-table instance slot-name))) |
313 | closure)) |
314 | |
315 | (defmethod unregister-observer-with-class ((class observable) |
316 | instance |
317 | slot-name |
318 | key) |
319 | (remhash key (gethash slot-name |
320 | (slot-value instance 'observers)))) |
321 | |
322 | (defmethod get-observers-with-class ((class observable) |
323 | instance |
324 | slot-name) |
325 | (gethash slot-name (slot-value instance 'observers))) |
326 | |
327 | (defmethod (setf slot-value-using-class) :before (new-value |
328 | (class observable) |
329 | instance |
330 | slot) |
331 | (let ((slot-name (slot-definition-name slot))) |
332 | (if (not (eq 'observers slot-name)) |
333 | (let ((observers |
334 | (get-observers instance (slot-definition-name slot)))) |
335 | (if observers |
336 | (maphash #'(lambda (key observer) |
337 | (funcall observer |
338 | (if (slot-boundp instance slot-name) |
339 | (slot-value instance slot-name) |
340 | nil) |
341 | new-value)) |
342 | observers)))))) |
343 | </src> |
344 | |
bdbaa8e0 |
345 | |
346 | ** Violation of Encapsulation? |
347 | |
348 | A MOP may seem like a violation of encapsulation by revealing some |
349 | implementation details, but in reality a well designed protocol does |
350 | not reveal anything which was not already exposed. Implementation |
351 | decisions affect users, and some of these details do leak through to |
352 | higher levels (e.g. the memory layout of slots). Implicit in the |
353 | protocol specification are these implementation details, and the MOP |
354 | merely makes this limited subset available for customization. |
355 | |
356 | A MOP makes it possible to customize certain implementation decisions |
357 | that do not **radically** alter the behavior of the base language. The |
358 | conceptual vocabulary of the system retains its meaning, and so code |
359 | written in one dialect can interact with code written in another |
360 | without knowing that they speak different ones. |
361 | |
362 | * MOP Design Principles |
363 | |
364 | ** Layered Protocol |
365 | |
366 | A layered protocol design is good for both meta and normal object |
367 | protocols, and enables a combinatorial explosion of customizations to |
368 | the protocol. |
369 | |
370 | *** Top Level **Must** Call Lower Level Methods |
371 | |
372 | The top level methods of a layered protocol are required to call |
373 | certain lower level methods to perform some tasks. This both makes it |
374 | easier to customize the top level methods (which perform very broad |
375 | tasks) by providing some pieces of implementation for the programmer, |
376 | and enables more customization by opening up the replacement of lower |
377 | level functions as a way to alter a small detail of the high level |
378 | behavior. |
379 | |
380 | *** Lower Level Methods are Easier to Customize |
381 | |
382 | The lower level methods of a MOP are limited in scope and can be |
383 | implemented easily. Often the desired changes to language behavior are |
384 | minor, and having methods that perform simple tasks which are often |
385 | customized reduces the effort required to extend the system. |
386 | |
387 | ** Functional Where Possible |
388 | |
389 | Functional protocols are preferred for MOPs (and object protocols in |
390 | general). Functional protocols open up several optimizations for the |
391 | implementation without burdening the user of the protocol. |
392 | |
393 | *** Memoization |
394 | |
395 | Memoization is the process of saving the results of a function call |
396 | for future use. This avoids expensive recomputation of values which |
397 | have not changed (recall that a true function will always return the |
398 | same result when given the same arguments). |
399 | |
400 | A functional MOP can be optimized easily by exploiting this property |
401 | to memoize the return values of calls to expensive operations. A MOP |
402 | must be be very fast to avoid making programs unusably slow, and |
403 | memoization is able to give an appreciable speedup in many cases |
404 | without a significant burden on memory usage. |
405 | |
406 | *** Constant Shared Return Values |
407 | |
408 | Disallowing modification of values returned by protocol methods allows |
409 | the implementation to return large data structures by reference to |
410 | avoid expensive copying without having to do expensive data integrity |
411 | checks or copying. |
412 | |
b796d643 |
413 | ** Procedural Only Where Necessary |
bdbaa8e0 |
414 | |
b796d643 |
415 | Some operations like method invocation are inherently stateful and so |
bdbaa8e0 |
416 | must use a procedural protocol. There is no benefit to be gained from |
417 | using a functional protocol, and indeed an attempt would result in |
b796d643 |
418 | obtuse code that severely restricted the implementation. Do note that |
bdbaa8e0 |
419 | only a very small part of method invocation is stateful (the actual |
420 | call), and most of it can be implemented functionally (e.g. computing |
421 | the discriminating function). |
422 | |
8a7c1bf7 |
423 | ** Real World |
bdbaa8e0 |
424 | |
8a7c1bf7 |
425 | *** [[http://common-lisp.net/project/ucw/][UCW]] and [[http://common-lisp.net/project/bese/arnesi.html][Arnesi]] |
426 | |
b796d643 |
427 | Arnesi uses the CLOS MOP to implement methods which are transparently |
8a7c1bf7 |
428 | rewritten into continuation passing style. This allows their execution |
429 | to be suspended at certain points and resumed later. UCW builds on top |
430 | of this to support a web framework where the statelessness of http is |
431 | hidden from the user; displaying a page suspends the execution of the |
432 | current continuation, and resumes it upon submission. The user level |
433 | code is completely unaware of this. |
434 | |
435 | *** [[http://clsql.b9.com][CLSQL]] |
436 | |
437 | CLSQL uses the reflective part of the CLOS MOP to map Common Lisp data |
438 | types into SQL types, and the intercessory protocol for slot |
439 | allocation to map slots onto database columns or sql expressions (for |
440 | implementing relational slots). |
441 | |
442 | *** [[http://common-lisp.net/project/elephant/][Elephant]] |
443 | |
b796d643 |
444 | Elephant uses the CLOS MOP to transparently store any class to disk |
bdbaa8e0 |
445 | and handle paging between the disk store and memory efficiently |
446 | without user intervention. |
8a7c1bf7 |
447 | |
62e96183 |
448 | * Sources and Further Reading |
8a7c1bf7 |
449 | |
450 | ** Sources |
451 | |
452 | *** The Art of the Metaobject Protocol |
bdbaa8e0 |
453 | |
8a7c1bf7 |
454 | **** Kiczales, Gregor et al. MIT Press 1991 |
455 | |
456 | Highly recommended reading even if you plan to never implement a MOP |
457 | or use the CLOS one. The design principles it recommends are quite |
458 | useful. |
459 | |
460 | *** [[http://www.lisp.org/mop/contents.html][CLOS MOP Specification]] |
461 | |
462 | Specification of the MOP for CLOS defined in *The Art of the Metaobject Protocol*. |
463 | |
464 | *** [[http://citeseer.ist.psu.edu/399658.html][Metaobject Protocols: Why We Want Them and What Else They Can Do]] |
465 | |
466 | A short overview of MOP design principles followed by three example |
467 | metaobject protocols for Scheme. |
468 | |
469 | *** [[http://www2.parc.com/csl/groups/sda/projects/oi/towards-talk/transcript.html][Why Are Black Boxes so Hard to Reuse?]] |
470 | |
471 | Transcription of a talk on the benefits of open implementations of |
472 | software. It first discusses several problems that black box software |
473 | implementations pose, and then presents existing solutions. It shows |
474 | how the existing solutions are insufficient, and then provides |
475 | metaobject protocols as a solution to most of the problems. |
476 | |
477 | ** Further Reading |
478 | |
479 | *** [[http://citeseer.ist.psu.edu/chiba95metaobject.html][A Metaobject Protocol for C++]] |
480 | |
481 | Example of a purely compile time MOP. It implements the functionality |
482 | of a code walker and something similar to the Lisp macro system. |
483 | |
484 | *** [[http://www.parc.com/csl/groups/sda/publications/papers/Kiczales-TUT95/for-web.pdf][Open Implementations and Metaobject Protocols]] |
485 | |
486 | It is a bit long, but it seems to follow a similar structure to AMOP |
487 | in introducing MOPs and their usefulness. The pages are slides with |
488 | notes, and so the 331 pages might not actually take that long to read. |
b796d643 |
489 | |
490 | ** Software |
491 | |
492 | *** [[http://common-lisp.net/project/closer/closer-mop.html][Closer to MOP]] |
493 | |
494 | Compatibility layer that attempts to present the *Art of the Metaobject |
495 | Protocol* MOP specification properly in as many Common Lisp |
496 | implementation as possible. |